The RareDis corpus contains more than 5,000 rare diseases and almost 6,000 clinical manifestations, all annotated. In addition, the inter-annotator agreement evaluation shows relatively high agreement (an F1-measure of 83.5% under exact-match criteria for the entities and of 81.3% for the relations). Based on these results, the corpus is of high quality, representing an important step for the field given the scarcity of available corpora annotated with rare diseases. This could open the door to further NLP applications that would facilitate the diagnosis and treatment of these rare diseases and therefore dramatically improve the quality of life of these patients.
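To make the reported agreement figures concrete, the following is a minimal sketch (not taken from the paper) of how an exact-match F1 between two annotators' entity spans is typically computed; the span format and labels are purely illustrative.

```python
# Minimal sketch (not from the paper): exact-match F1 between two annotators'
# entity sets, each span represented as (start_offset, end_offset, label).
def exact_match_f1(annotator_a, annotator_b):
    a, b = set(annotator_a), set(annotator_b)
    true_positives = len(a & b)                 # spans identical in offsets and label
    precision = true_positives / len(a) if a else 0.0
    recall = true_positives / len(b) if b else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: 5 of 6 spans agree for both annotators -> F1 = 0.833.
a = [(0, 12, "DISEASE"), (20, 35, "SIGN"), (40, 52, "SYMPTOM"),
     (60, 71, "DISEASE"), (80, 95, "SIGN"), (100, 110, "SYMPTOM")]
b = a[:5] + [(200, 210, "DISEASE")]
print(round(exact_match_f1(a, b), 3))  # 0.833
```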
Recently, Person Re-Identification (Re-ID) has received a lot of attention. Large datasets containing labeled images of various individuals have been released, allowing researchers to develop and test many successful approaches. However, when such Re-ID models are deployed in new cities or environments, the task of searching for people within a network of security cameras is likely to face an important domain shift, thus resulting in decreased performance. Indeed, while most public datasets were collected in a limited geographic area, images from a new city present different features (e.g., people's ethnicity and clothing style, weather, architecture, etc.). In addition, the whole frames of the video streams must be converted into cropped images of people using pedestrian detection models, which behave differently from the human annotators who created the dataset used for training. To better understand the extent of this issue, this paper introduces a complete methodology to evaluate Re-ID approaches and training datasets with respect to their suitability for unsupervised deployment for live operations. This method is used to benchmark four Re-ID approaches on three datasets, providing insight and guidelines that can help to design better Re-ID pipelines in the future.
A link stream is a set of triplets $(t, u, v)$ indicating that $u$ and $v$ interacted at time $t$. Link streams model numerous datasets and their proper study is crucial in many applications. In practice, raw link streams are often aggregated or transformed into time series or graphs, where decisions are made. Yet, it remains unclear how the dynamical and structural information of a raw link stream carries over into the transformed object. This work shows that it is possible to shed light on this question by studying link streams via algebraically linear graph and signal operators, for which we introduce a novel linear matrix framework for the analysis of link streams. We show that, due to their linearity, most methods in signal processing can be easily adopted by our framework to analyze the time/frequency information of link streams. However, the availability of linear graph methods to analyze relational/structural information is limited. We address this limitation by developing (i) a new basis for graphs that allows us to decompose them into structures at different resolution levels; and (ii) filters for graphs that allow us to change their structural information in a controlled manner. By plugging these developments and their time-domain counterparts into our framework, we are able to (i) obtain a new basis for link streams that allows us to represent them in a frequency-structure domain; and (ii) show that many interesting transformations of link streams, such as the aggregation of interactions or their embedding into a Euclidean space, can be seen as simple filters in our frequency-structure domain.
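As a concrete illustration of the simplest such transformation, the sketch below (illustrative only; the paper's matrix framework is more general) aggregates a link stream of $(t, u, v)$ triplets into a static weighted adjacency matrix over a chosen time window, i.e., the time-aggregation of interactions mentioned above.

```python
import numpy as np

# Illustrative only: aggregate a link stream of (t, u, v) triplets into a
# static weighted adjacency matrix by summing interactions in a time window.
def aggregate(link_stream, n_nodes, t_start=-np.inf, t_end=np.inf):
    A = np.zeros((n_nodes, n_nodes))
    for t, u, v in link_stream:
        if t_start <= t <= t_end:       # restrict to the chosen window
            A[u, v] += 1
            A[v, u] += 1                # treat interactions as undirected
    return A

stream = [(0.5, 0, 1), (1.2, 1, 2), (2.7, 0, 1), (3.1, 2, 3)]
print(aggregate(stream, n_nodes=4, t_start=0.0, t_end=2.0))
```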
Scenario-based probabilistic forecasts have become a vital tool to equip decision-makers to address the uncertain nature of renewable energies. To that end, this paper presents a recent and promising deep learning generative approach: denoising diffusion probabilistic models. This is a class of latent variable models that has recently demonstrated impressive results in the computer vision community. However, to the best of our knowledge, it has not yet been demonstrated that they can generate high-quality samples of load, PV, or wind power time series, which are crucial for facing the new challenges in power systems applications. Thus, we propose the first implementation of this model for energy forecasting using the open data of the Global Energy Forecasting Competition 2014. The results demonstrate that this approach is competitive with other state-of-the-art deep learning generative models, including generative adversarial networks, variational autoencoders, and normalizing flows.
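For readers unfamiliar with this model class, below is a minimal, generic sketch of the DDPM forward (noising) process applied to a 1-D time series; the noise schedule, shapes, and toy profile are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Generic DDPM forward-process sketch (illustrative, not the paper's code):
# q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)
T = 1000
betas = np.linspace(1e-4, 0.02, T)            # linear noise schedule (assumed)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def q_sample(x0, t, rng=np.random.default_rng(0)):
    """Noise a clean series x0 to diffusion step t in closed form."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

x0 = np.sin(np.linspace(0, 2 * np.pi, 24))    # toy 24-step "load" profile
x_noisy = q_sample(x0, t=500)                 # partially diffused sample
```

A denoising network is then trained to predict the injected noise, and new scenarios are generated by running the learned reverse process from pure noise.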
This work is on vision-based planning strategies for legged robots that separate locomotion planning into foothold selection and pose adaptation. Current pose adaptation strategies optimize the robot's body pose relative to given footholds. If these footholds are not reached, the robot may end up in a state with no reachable safe footholds. Therefore, we present a Vision-Based Terrain-Aware Locomotion (ViTAL) strategy that consists of novel pose adaptation and foothold selection algorithms. ViTAL introduces a different paradigm in pose adaptation: rather than optimizing the body pose relative to given footholds, it selects the body pose that maximizes the chances of the legs reaching safe footholds. ViTAL plans footholds and poses based on skills that characterize the robot's capabilities and its terrain-awareness. We use the 90 kg HyQ and 140 kg HyQReal quadruped robots to validate ViTAL, and show that they are able to climb various obstacles including stairs, gaps, and rough terrain at different speeds and gaits. We compare ViTAL with a baseline strategy that selects the robot pose based on given selected footholds, and show that ViTAL outperforms the baseline.
Urbanization and its problems call for an in-depth and comprehensive understanding of urban dynamics, especially the complex and diverse lifestyles of modern cities. Digitally acquired data can accurately capture complex human activity, but it lacks the interpretability of demographic data. In this paper, we study a privacy-enhanced dataset of the mobility visitation patterns of 1.2 million people to 1.1 million places in 11 US metropolitan areas in order to detect latent mobility behaviors and lifestyles in the largest American cities. Despite the considerable complexity of mobility visitations, we find that lifestyles can be automatically decomposed into only 12 latent, interpretable activity behaviors describing how people combine shopping, eating, working, or using their free time. Rather than characterizing individuals by a single lifestyle, we find that city dwellers' behavior is a mixture of these behaviors. The detected latent activity behaviors are equally present across cities and cannot be fully explained by major demographic features. Finally, we find that these latent behaviors are associated with experienced income segregation, transportation, or healthy behaviors in cities, even after controlling for demographic features. Our results suggest the importance of complementing demographic data with activity behaviors in order to understand urban dynamics.
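The abstract does not state which decomposition technique is used; as one hypothetical illustration of how visitation patterns can be decomposed into latent behaviors with per-person mixtures, the sketch below uses non-negative matrix factorization on a synthetic person-by-place-category matrix. All data and parameters are assumptions.

```python
import numpy as np
from sklearn.decomposition import NMF

# Hypothetical sketch: decompose a person-by-place-category visitation matrix
# into latent "activity behaviors" and per-person mixtures of those behaviors.
# The method and data here are illustrative, not the authors' actual approach.
rng = np.random.default_rng(0)
visits = rng.poisson(2.0, size=(1000, 50)).astype(float)  # people x categories

model = NMF(n_components=12, init="nndsvda", max_iter=500, random_state=0)
mixtures = model.fit_transform(visits)   # each row: one person's behavior mixture
behaviors = model.components_            # each row: one latent activity behavior
```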
Person Re-Identification (Re-ID) aims to search for a person of interest (the query) in a network of cameras. In the classic Re-ID setting, the query is searched for in a gallery containing properly cropped images of entire bodies. Recently, the live Re-ID setting was introduced to better represent the practical application context of Re-ID. It consists of searching for the query in short videos containing whole scene frames. The initial live Re-ID baseline used a pedestrian detector to build a large search gallery and a classic Re-ID model to find the query within it. However, the generated gallery was too large and contained low-quality images, which decreased live Re-ID performance. Here, we present a new live Re-ID approach called TrADe that generates smaller, higher-quality galleries. TrADe first uses a tracking algorithm to identify sequences of images of the same individual in the gallery; an anomaly detection model is then used to select a single good representative of each tracklet. TrADe is validated on the live Re-ID version of the PRID-2011 dataset and shows significant improvements over the baseline.
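A minimal, hypothetical sketch of this tracking-then-selection idea is shown below; `run_tracker` and `anomaly_score` are toy stand-ins, not the tracking and anomaly-detection models actually used in the paper.

```python
# Hypothetical sketch of the tracking-then-selection idea behind TrADe.
def build_gallery(detections, run_tracker, anomaly_score):
    gallery = []
    for tracklet in run_tracker(detections):      # crops of one person over time
        best = min(tracklet, key=anomaly_score)   # keep the most "normal" crop
        gallery.append(best)
    return gallery

# Toy usage: crops are (track_id, blur) pairs; lower blur = better crop.
crops = [(1, 0.9), (1, 0.1), (2, 0.5), (2, 0.4)]
group_by_track = lambda dets: [[d for d in dets if d[0] == t]
                               for t in {d[0] for d in dets}]
print(build_gallery(crops, group_by_track, anomaly_score=lambda c: c[1]))
```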
We present a new method to construct fuzzy clusters of large datasets using a smoothing numerical approach. The usual criterion is relaxed, so the search for good fuzzy partitions is carried out over a continuous space rather than a combinatorial one, as in classical approaches \cite{hartigan}. The smoothing transforms a strongly non-differentiable problem into differentiable optimization subproblems by using a class of infinitely differentiable functions. To implement the algorithm, we used the statistical software $R$ and compared the results obtained with the traditional fuzzy $C$-means method proposed by Bezdek.
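For reference, the classical fuzzy $C$-means objective due to Bezdek, which serves as the comparison point here (this is the standard formulation, not the paper's smoothed reformulation), is

$$ J_m(U, C) = \sum_{i=1}^{n}\sum_{j=1}^{k} u_{ij}^{\,m}\,\lVert x_i - c_j\rVert^2, \qquad \text{subject to } \sum_{j=1}^{k} u_{ij} = 1,\; u_{ij} \ge 0, $$

where $u_{ij}$ is the membership of point $x_i$ in cluster $j$, $c_j$ is the $j$-th cluster center, and $m>1$ controls the degree of fuzziness.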
In this contribution, we use an ensemble deep learning approach to combine the predictions of two individual one-stage detectors (namely YOLOv4 and YOLACT) with the goal of detecting artifacts in endoscopic images. This ensemble strategy allows us to improve the robustness of the individual models without harming their real-time computation capabilities. We demonstrate the effectiveness of the approach by training and testing the two individual models and various ensemble configurations on the "Endoscopic Artifact Detection Challenge" dataset. Extensive experiments show the superiority, in terms of mean average precision, of the ensemble approach over the individual models and previous works.
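The abstract does not describe the exact fusion rule; as one plausible illustration of combining two detectors' outputs, the sketch below pools the boxes from both models and applies class-wise non-maximum suppression. Box coordinates, scores, and class labels are illustrative assumptions.

```python
# Illustrative fusion only; the paper's exact ensembling rule is not given in
# the abstract. Boxes are (x1, y1, x2, y2) with one score and class per box.
def iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def ensemble_nms(dets_a, dets_b, iou_thr=0.5):
    """Pool detections from both models; keep the highest-scoring,
    non-overlapping boxes per class."""
    pooled = sorted(dets_a + dets_b, key=lambda d: -d["score"])
    kept = []
    for d in pooled:
        if all(k["cls"] != d["cls"] or iou(k["box"], d["box"]) < iou_thr
               for k in kept):
            kept.append(d)
    return kept

dets_yolo = [{"box": (10, 10, 50, 50), "score": 0.9, "cls": "specularity"}]
dets_yolact = [{"box": (12, 11, 52, 49), "score": 0.8, "cls": "specularity"},
               {"box": (100, 100, 140, 150), "score": 0.7, "cls": "blur"}]
print(ensemble_nms(dets_yolo, dets_yolact))
```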
The Maximum Linear Arrangement problem (MaxLA) consists of finding a mapping $\pi$ of the $n$ vertices of a graph $G$ to distinct positions $1,\dots,n$ on a line that maximizes $D_{\pi}(G)=\sum_{uv\in E(G)}|\pi(u)-\pi(v)|$. In this setting, vertices are considered to lie on a horizontal line and edges are drawn as semicircles above the line. Variants of MaxLA exist in which the arrangements are constrained. In the planar variant, edge crossings are forbidden. In the projective variant, for rooted trees, arrangements are planar and the root cannot be covered by any edge. Here we present $O(n)$-time and $O(n)$-space algorithms that solve planar and projective MaxLA for trees. We also prove several properties of maximum projective and planar arrangements.
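To make the objective concrete, the small sketch below evaluates the cost $D_{\pi}(G)$ of a given arrangement of a tree; it is purely illustrative and does not reproduce the paper's linear-time maximization algorithms.

```python
# Illustrative only: evaluate the cost D_pi(G) of one arrangement of a tree.
def arrangement_cost(edges, pi):
    """edges: list of (u, v); pi: dict mapping vertex -> position 1..n."""
    return sum(abs(pi[u] - pi[v]) for u, v in edges)

# Path graph 1-2-3-4 under two arrangements:
edges = [(1, 2), (2, 3), (3, 4)]
identity = {1: 1, 2: 2, 3: 3, 4: 4}
swapped = {1: 2, 2: 4, 3: 1, 4: 3}
print(arrangement_cost(edges, identity))  # 1 + 1 + 1 = 3
print(arrangement_cost(edges, swapped))   # 2 + 3 + 2 = 7
```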